Pose Estimation

The Best 28 Pose Estimation Tools in 2025

SuperPoint is a fully convolutional network, trained with self-supervision, for interest point detection and description.

Pose Estimation

magic-leap-community

Vitpose Base Simple

ViTPose is a human pose estimation model based on Vision Transformer, achieving 81.1 AP accuracy on the MS COCO keypoint test set, with advantages such as model simplicity, scalable size, and flexible training.

Pose Estimation

Transformers English
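The AP figures quoted for the ViTPose entries in this list come from the MS COCO keypoint benchmark, which scores each predicted pose with Object Keypoint Similarity (OKS) and averages precision over OKS thresholds from 0.50 to 0.95. The sketch below shows only the OKS step for a single person. The function name and argument layout are illustrative assumptions, and the per-keypoint `sigmas` are assumed to be supplied by the caller rather than taken from any particular toolkit.

```python
import math


def object_keypoint_similarity(pred, truth, visibility, sigmas, area):
    """OKS between one predicted pose and one ground-truth pose.

    pred, truth: lists of (x, y) pixel coordinates, one pair per keypoint.
    visibility:  per-keypoint flags; a flag of 0 means unlabeled and is skipped.
    sigmas:      per-keypoint falloff constants (assumed to come from the caller).
    area:        ground-truth object area in pixels squared (the squared scale).
    """
    total, labeled = 0.0, 0
    for (px, py), (tx, ty), v, sigma in zip(pred, truth, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - tx) ** 2 + (py - ty) ** 2   # squared pixel distance
        k2 = (2.0 * sigma) ** 2                # per-keypoint constant, squared
        total += math.exp(-d2 / (2.0 * area * k2))
        labeled += 1
    return total / labeled if labeled else 0.0


if __name__ == "__main__":
    truth = [(10.0, 10.0), (20.0, 30.0)]
    shifted = [(12.0, 10.0), (20.0, 33.0)]
    print(object_keypoint_similarity(shifted, truth, [2, 2], [0.05, 0.05], 400.0))
```

A perfect prediction scores 1.0, and the score decays toward 0 as keypoints drift, more slowly for large objects and for keypoints with a large sigma.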

Vitpose Plus Small

ViTPose++ is a vision Transformer-based human pose estimation model, achieving an outstanding 81.1 AP on the MS COCO keypoint detection benchmark.

Pose Estimation

Vitpose Plus Base

ViTPose is a vision Transformer-based human pose estimation model that achieves an outstanding performance of 81.1 AP on the MS COCO keypoint detection benchmark with a simple design.

Pose Estimation

Transformers English

Superglue Outdoor

SuperGlue is a graph neural network-based feature matching model for matching interest points in images, suitable for image matching and pose estimation tasks.

Pose Estimation

magic-leap-community

Vitpose Plus Huge

ViTPose++ is a vision Transformer-based foundational model for human pose estimation, achieving an outstanding performance of 81.1 AP on the MS COCO keypoint test set.

Pose Estimation

img2pose is a Faster R-CNN-based model for predicting the six degrees of freedom (6DoF) pose of all faces in a photo and projecting 3D faces onto a 2D plane.

Pose Estimation
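The img2pose entry above mentions projecting 3D faces onto the 2D image plane from a 6DoF pose. As a rough illustration of what that projection involves, the sketch below applies a rotation and translation to 3D points and then a pinhole camera model. The rotation-vector parameterization, the function names, and the intrinsics (`focal`, `cx`, `cy`) are assumptions made for this example and are not taken from the img2pose code.

```python
import math


def rotation_from_rotvec(rx, ry, rz):
    """Rodrigues formula: rotation vector (unit axis times angle) to a 3x3 matrix."""
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def project_points(points_3d, rotvec, translation, focal, cx, cy):
    """Project 3D points to pixels using a 6DoF pose and a pinhole camera.

    Assumes every transformed point lies in front of the camera (z > 0).
    """
    R = rotation_from_rotvec(*rotvec)
    tx, ty, tz = translation
    pixels = []
    for X, Y, Z in points_3d:
        # Rigid transform into the camera frame: x_cam = R @ x + t
        xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + tx
        yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + ty
        zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + tz
        # Perspective divide, then shift by the principal point
        pixels.append((focal * xc / zc + cx, focal * yc / zc + cy))
    return pixels


if __name__ == "__main__":
    print(project_points([(1.0, 0.0, 0.0)], (0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 100.0, 50.0, 50.0))
```

The six degrees of freedom are the three rotation components and the three translation components; the camera intrinsics are separate inputs.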

Vitpose Plus Large

ViTPose++ is a vision Transformer-based foundation model for human pose estimation, achieving an outstanding performance of 81.1 AP on the MS COCO keypoint test set.

Pose Estimation

Synthpose Vitpose Huge Hf

SynthPose is a keypoint detection model based on the VitPose huge backbone network, fine-tuned with synthetic data to predict 52 human keypoints, suitable for kinematic analysis.

Pose Estimation

Sapiens Pose 1b Torchscript

Sapiens is a vision Transformer model pre-trained on 300 million 1024x1024 resolution human images, specifically designed for high-precision pose estimation tasks.

Pose Estimation English

Synthpose Vitpose Base Hf

SynthPose is a 2D human pose estimation model based on VitPose Base, fine-tuned with synthetic data, capable of predicting 52 anatomical keypoints.

Pose Estimation

Reloc3r is a concise and efficient camera pose estimation framework that combines a pretrained dual-view relative camera pose regression network with a multi-view motion averaging module.

Pose Estimation
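The Reloc3r entry above is built around relative camera pose, meaning the rigid transform that takes points from one camera's frame to another's. The sketch below is not the Reloc3r network or its motion averaging module; it only shows, under an assumed world-to-camera convention (x_cam = R x_world + t), how a relative pose is defined from two known absolute poses. All names are illustrative.

```python
def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a length-3 vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def relative_pose(R1, t1, R2, t2):
    """Pose of camera 2 relative to camera 1.

    Both inputs are world-to-camera poses, so that x_cam = R @ x_world + t.
    Returns (R_rel, t_rel) with x_cam2 = R_rel @ x_cam1 + t_rel.
    """
    R_rel = mat_mul(R2, transpose(R1))
    shifted = mat_vec(R_rel, t1)
    t_rel = [t2[i] - shifted[i] for i in range(3)]
    return R_rel, t_rel


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Two cameras with the same orientation, offset along x.
    print(relative_pose(identity, [0.0, 0.0, 0.0], identity, [1.0, 0.0, 0.0]))
```

A learned regressor predicts a transform of this kind directly from an image pair, without access to the absolute poses.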

A vision Transformer-based human pose estimation model achieving an outstanding performance of 81.1 AP on the MS COCO keypoint test set.

Pose Estimation

Transformers English

Lightglue Superpoint

LightGlue is an efficient keypoint detection and matching model for feature matching and pose estimation problems in computer vision.

Pose Estimation

Reloc3r is a large-scale relative camera pose regression model for visual localization, featuring generalization, speed, and precision.

Pose Estimation

Vitpose Base Simple

This is a Transformer-based keypoint detection model used to locate keypoints in images.

Pose Estimation

Sapiens Pose Bbox Detector

RTMDet is an efficient object detector used here to produce the person bounding boxes that the Sapiens pose estimation model takes as input for human keypoint detection.

Pose Estimation

Sapiens Pose 1b

Pose-Sapiens-1B is a high-resolution human pose estimation model based on the Vision Transformer architecture, pre-trained on 300 million 1024x1024 resolution human images and supporting detection of 308 keypoints (body, face, hands, and feet).

Pose Estimation English

Poseless-3B is a vision-language model (VLM)-based robotic hand control framework that directly maps 2D images to joint angles without explicit pose estimation.

Pose Estimation

Sapiens Pose 0.3b Torchscript

Sapiens is a vision Transformer model pre-trained on 300 million high-resolution human images, specifically designed for pose estimation tasks and supporting detection of 308 keypoints.

Pose Estimation English

Vitpose Base Coco Aic Mpii

ViTPose is a human pose estimation model based on Vision Transformer, achieving outstanding performance on benchmarks like MS COCO through simple architectural design.

Pose Estimation

Transformers English

Vitpose Base Simple

A lightweight pose estimation model based on the ViT architecture for human keypoint detection.

Pose Estimation

Sapiens Pose 1b Bfloat16

Sapiens is a family of vision Transformer models pre-trained on 300 million 1024x1024 resolution human images, focusing on human-centric vision tasks.

Pose Estimation English

Sapiens Pose 0.6b Torchscript

Sapiens is a vision Transformer model pre-trained on 300 million high-resolution human images, specifically designed for pose estimation tasks and supporting detection of 308 keypoints.

Pose Estimation English

Diffusion Pusht Keypoints

A robot control model trained with Diffusion Policy for the PushT task, using keypoint observations as input.

Pose Estimation

Vitpose Base Simple

ViTPose is a baseline model for human pose estimation based on plain vision transformers, achieving high-performance keypoint detection with a simple architecture.

Pose Estimation

Transformers English

Sapiens Pose 0.6b

Sapiens is a family of vision Transformer models pre-trained on 300 million high-resolution human images, focusing on human-centric vision tasks.

Pose Estimation English

This model is used to detect keypoints in images or videos, suitable for tasks such as human pose estimation and facial landmark detection.

Pose Estimation

AIbase

© 2025 AIbase